Field of invention

The present invention relates to various uses of split fluorophore complementation in
relation to, inter alia, screening for drugs capable of modulating protein-protein interactions,
screening for participants in specific protein-protein interactions, and screening for
protein-protein interactions modulated by specific compounds.



Background of the invention

It has been suggested to use the reassembly of certain enzyme fragments to the
complete enzyme as a measure of protein-protein interactions. PNAS 91, 1994 -
Johnsson discloses reassembly of ubiquitin. This reassembly is detected through the
irreversible cleavage of the fusion by ubiquitin protease and release of a reporter. As
opposed to the two-hybrid technique, this technique includes the possibility of monitoring
a protein-protein interaction as a function of time, at the natural sites of this interaction in a
living cell.

A similar system is suggested for the reassembly of β-galactosidase (PNAS 94, 1997 -
Rossi), DHFR (WO98/34120) and Green Fluorescent Protein (GFP) (J.Am.Chem.Soc
122, 2000 - Ghosh and WO01/87919). The basic concept is that by splitting a functional
protein in two fragments, the function is lost. The two fragments are then transfected into
cells fused in frame to proteins X and Y, respectively. Binding between proteins X and Y
will bring the two fragments so close together that the functional protein will regain its
function. If the function is DHFR, the cells will survive only if proteins X and Y bind to each
other. If the function is fluorescence, the cells will emit light upon excitation only if protein
X and protein Y bind to each other.

The present invention describes improvements and new uses of this reassembly
technique, the technique of fluorescence complementation.



Detailed disclosure

When the word "fluorophore" is used in the present application, it is meant to indicate a
fluorescent protein, that is, a protein that, when expressed by a cell, emits fluorescence
upon exposure to light of the correct excitation wavelength (e.g. as described by Chalfie,
M. et al. (1994) Science 263, 802-805). One or more amino acids of the fluorophore may
have been substituted, inserted or deleted. Fluorophore, as used herein, includes inter
alia wild-type Green Fluorescent Protein (GFP) derived from the jellyfish Aequorea
victoria, or from other members of the Coelenterata, such as the red fluorescent protein
from Discosoma sp. (Matz, M.V. et al. 1999, Nature Biotechnology 17: 969-973), GFP
from Renilla reniformis, GFP from Renilla muelleri or fluorescent proteins from other
animals, fungi or plants. The term also includes modifications of GFP, such as the blue
fluorescent variant of GFP disclosed by Heim et al. (Heim, R. et al., 1994,
Proc.Natl.Acad.Sci. 91:26, pp 12501-12504), and other modifications that change the
spectral properties of the GFP fluorescence, or modifications that exhibit increased
fluorescence when expressed in cells at a temperature above about 30°C, described in
PCT/DK96/00051, published as WO 97/11094 on 27 March 1997, and that comprise a
fluorescent protein derived from Aequorea Green Fluorescent Protein or any functional
analogue thereof, wherein the amino acid in position 1 upstream from the chromophore has
been mutated to provide an increase of fluorescence intensity when the fluorescent protein
of the invention is expressed in cells.

Numerous cell systems for transfection exist. A few examples are Xenopus oocytes or
insect cells, such as the sf9 cell line, or mammalian cells isolated directly from tissues or
organs taken from healthy or diseased animals (primary cells), or transformed mammalian
cells capable of indefinite replication under cell culture conditions (cell lines). However, it
is preferred that the cells used are mammalian cells. This is due to the complex
biochemical interactions specific for each cell type. The term "mammalian cell" is intended
to indicate any living cell of mammalian origin. The cell may be an established cell line,
many of which are available from The American Type Culture Collection (ATCC, Virginia,
USA) or similar Cell Culture Collections. The cell may be a primary cell with a limited life
span derived from a mammalian tissue, including tissues derived from a transgenic
animal, or a newly established immortal cell line derived from a mammalian tissue
including transgenic tissues, or a hybrid cell or cell line derived by fusing different cell
types of mammalian origin, e.g. hybridoma cell lines. The cells may optionally express one
or more non-native gene products, e.g. receptors, enzymes, enzyme substrates, prior to
or in addition to the fluorescent probe. Preferred cell lines include, but are not limited to,
those of fibroblast origin, e.g. BHK, CHO, BALB, NIH-3T3, or of endothelial origin, e.g.
HUVEC, BAE (bovine artery endothelial), CPAE (cow pulmonary artery endothelial),
HLMVEC (human lung microvascular endothelial cells), or of airway epithelial origin, e.g.
BEAS-2B, or of pancreatic origin, e.g. RIN, INS-1, MIN6, bTC3, aTC6, bTC6, HIT, or of
hematopoietic origin, e.g. primary isolated human monocytes, macrophages, neutrophils,
basophils, eosinophils and lymphocyte populations, AML-14, AML-193, HL-60, RBL-1,
U937, RAW, JAWS, or of adipocyte origin, e.g. 3T3-L1, human pre-adipocytes, or of
neuroendocrine origin, e.g. AtT20, PC12, GH3, muscle origin, e.g. SKMC, A10, C2C12,
renal origin, e.g. HEK 293, LLC-PK1, or of neuronal origin, e.g. SK-N-DZ, SK-N-BE(2),
HCN-1A, NT2/D1.

The non-fluorescent fragments of fluorescent proteins that can be combined to form one
functional fluorescent unit are usually produced by splitting the coding nucleotide sequence
of one fluorescent protein at an appropriate site and expressing each nucleotide
sequence fragment independently. The fluorescent protein fragments may be expressed
alone or in fusion with one or more protein fusion partners. Each translated sequence
must contain appropriate start and stop codons and some residues in the fluorescent
protein may be encoded by both coding sequences whereas other residues may not be
encoded by any of the coding sequences.
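The splitting step can be illustrated at the protein level with a minimal Python sketch. The function name split_protein, the overlap parameter and the toy sequence are illustrative assumptions, not part of the disclosure. The sketch also assumes that the C-terminal fragment is expressed on its own, so that a start Met is prepended; in a fusion construct the start codon would instead belong to the fusion partner.

```python
def split_protein(sequence: str, last_n_residue: int, overlap: int = 0):
    """Split a protein sequence after residue `last_n_residue` (1-based).

    `overlap` (hypothetical parameter) extends the C-terminal fragment
    backwards, so that those residues are encoded by both fragments.
    """
    if not 1 <= last_n_residue < len(sequence):
        raise ValueError("split site must lie inside the sequence")
    if not 0 <= overlap < last_n_residue:
        raise ValueError("overlap must be shorter than the N-terminal fragment")
    n_fragment = sequence[:last_n_residue]
    c_fragment = sequence[last_n_residue - overlap:]
    # Assumption: the C-terminal fragment is translated independently and
    # therefore needs its own start codon (Met) if it lacks one.
    if not c_fragment.startswith("M"):
        c_fragment = "M" + c_fragment
    return n_fragment, c_fragment


toy = "MSKGEELFTGVVPILVELDG"  # short toy sequence, for illustration only
n_frag, c_frag = split_protein(toy, last_n_residue=12)
print(n_frag)  # N-terminal fragment
print(c_frag)  # C-terminal fragment with added start Met
```

Setting overlap to a positive value gives the case in which some residues are encoded by both coding sequences.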

We have data on hand suggesting to split EGFP at amino acid 157-158, 172-173, or at
amino acid 144-145. Based on these findings it is concluded that appropriate splitting sites
in GFP are located in the loop regions between the residues that form the beta-sheet
structures of the GFP beta-barrel. Accordingly, splits in GFP must be made next to or
close to residues 23, 39, 51, 102, 116, 157, or 173 or within, next to or close to the
regions defined by residues 76-90, 129-144, 189-197, or 209-215 (Fig. 1). All residues are
numbered according to the numbering of wild type A. victoria GFP (Genbank accession
no. M62653) and said numbering also applies to equivalent positions in homologous
sequences exemplified by alignment #1 of fluorescent protein sequences.
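As a rough illustration, a candidate split position can be checked against the loop residues and loop regions listed above. The sketch below is in Python; the function name near_loop and the tolerance of two residues are assumptions, since the text does not quantify "next to or close to".

```python
# Loop positions and regions as listed in the text (wild type GFP numbering).
LOOP_RESIDUES = (23, 39, 51, 102, 116, 157, 173)
LOOP_REGIONS = ((76, 90), (129, 144), (189, 197), (209, 215))


def near_loop(split_after: int, tolerance: int = 2) -> bool:
    """Return True if a split after residue `split_after` lies within
    `tolerance` residues of a listed loop residue or loop region.

    `tolerance` is a hypothetical stand-in for "next to or close to".
    """
    near_residue = any(abs(split_after - r) <= tolerance for r in LOOP_RESIDUES)
    near_region = any(
        low - tolerance <= split_after <= high + tolerance
        for low, high in LOOP_REGIONS
    )
    return near_residue or near_region


for site in (144, 157, 172, 60, 110):
    print(site, near_loop(site))
```

The three split sites mentioned for EGFP (after residues 144, 157 and 172) pass this check, whereas positions 60 and 110, chosen here only as counter-examples, do not.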

The choice of split site for a particular assay depends on the properties needed for the
fluorophore, as it is presently assumed that the various split sites will have different
influences on the speed of folding, the intensity etc. of the maturing or
matured fluorophore.

Thus, one aspect of the invention relates to a method for generating a library of interacting
proteins within living cells consisting of:

1. Introducing into a pool of cells two sets of plasmids, either simultaneously or
sequentially, one set of plasmids encoding a library of proteins A each fused to the
N-terminal half of the complementing fluorophore and the second set of plasmids encoding a
library of proteins B each fused to the C-terminal half of the complementing fluorophore.

2. Sorting the cells into those where a functional fluorophore has been formed and
those where a functional fluorophore has not been formed, the formation of said functional
fluorophore being indicative of an interaction having occurred between proteins A and B
within the cell.

In a preferred embodiment of the invention, the sorting in step 2 is done by FACS.
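The sorting decision in step 2 can be sketched in Python as a simple intensity gate. The gate used here (mean plus three standard deviations of a non-fluorescent control population) and all intensity values are illustrative assumptions; an actual FACS instrument applies its own gating.

```python
from statistics import mean, stdev


def gate_threshold(control_intensities, n_sd=3.0):
    """Hypothetical gate: mean + n_sd standard deviations of control cells."""
    return mean(control_intensities) + n_sd * stdev(control_intensities)


def sort_cells(intensities, threshold):
    """Return the indices of cells at or above the gate and of cells below it."""
    positive = [i for i, value in enumerate(intensities) if value >= threshold]
    negative = [i for i, value in enumerate(intensities) if value < threshold]
    return positive, negative


control = [10.0, 12.0, 11.0, 9.0, 13.0]   # made-up background readings
sample = [11.0, 250.0, 14.0, 90.0, 8.0]   # made-up library readings
threshold = gate_threshold(control)
positive, negative = sort_cells(sample, threshold)
print(positive, negative)
```

Cells in the positive group would be taken as carrying an interacting pair A-B.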

Another aspect of the invention relates to a method for assessing the general utility of a
compound in modulating protein-protein interactions consisting of:

1. Contacting a library of interacting proteins A and B within living cells with the
compound.

2. Sorting the cells into those where the fluorophore has been disrupted and those
where the fluorophore is intact, the disruption of said fluorophore being indicative of the
compound having disrupted the interaction between proteins A and B.

3. Calculating the fraction of interaction pairs A-B that has been split by the
compound by calculating the fraction of cells where disruption of the fluorophore has
occurred upon contact with the compound.

In a preferred embodiment of the invention, the sorting in step 2 is done by FACS.
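The calculation in step 3 amounts to a simple ratio. A minimal Python sketch follows, assuming that every cell in the starting library was fluorescent before contact with the compound and that a single intensity threshold separates intact from disrupted fluorophores; the function name, threshold and readings are made up for illustration.

```python
def fraction_disrupted(intensities_after, threshold):
    """Fraction of cells whose fluorescence fell below `threshold` after
    contact with the compound.

    Assumes all cells were above the threshold before the compound was added.
    """
    if not intensities_after:
        raise ValueError("no cells measured")
    disrupted = sum(1 for value in intensities_after if value < threshold)
    return disrupted / len(intensities_after)


after_compound = [200, 15, 180, 12, 9, 220, 150, 11]  # made-up readings
fraction = fraction_disrupted(after_compound, threshold=50)
print(fraction)
```

A compound with a high fraction affects many interaction pairs, while a low fraction points to a more selective compound.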

Yet another aspect of the invention relates to a method for determining the specific
protein-protein interactions inhibited by a compound comprising:

1. Contacting a library of interacting proteins A and B within living cells with the
compound.

2. Sorting the cells into those where the fluorophore has been disrupted and those
where the fluorophore is intact, the disruption of said fluorophore being indicative of the
compound having disrupted the interaction between proteins A and B.

3. Determining the identity of interacting pairs A-B split by the compound.

In a preferred embodiment of the invention, the sorting in step 2 is done by FACS.
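Step 3 can be sketched in Python under the assumption that the identity of proteins A and B in each sorted cell is already known, for instance from sequencing the plasmids recovered from that cell. The pair labels, the threshold and the rule that a pair counts as split only if no cell carrying it remains fluorescent are all illustrative assumptions.

```python
def disrupted_pairs(cells, threshold):
    """Return the A-B pairs found only among non-fluorescent cells.

    `cells` is an iterable of (protein_a, protein_b, intensity) tuples
    measured after contact with the compound.
    """
    dim = {(a, b) for a, b, value in cells if value < threshold}
    bright = {(a, b) for a, b, value in cells if value >= threshold}
    # Assumption: a pair is reported only if it never appears in a bright cell.
    return sorted(dim - bright)


cells = [
    ("A1", "B1", 12),    # hypothetical pair, dim after compound
    ("A1", "B1", 9),
    ("A2", "B2", 210),   # hypothetical pair, still bright
    ("A3", "B3", 14),    # dim in one cell ...
    ("A3", "B3", 180),   # ... but bright in another, so not reported
]
pairs = disrupted_pairs(cells, threshold=50)
print(pairs)
```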

As described in e.g. WO98/45704, the measurement of translocation of protein X from
one site to another reveals crucial information on the cellular dynamics. One aspect of the
invention relates to an assay for measuring translocation of protein X comprising the steps
of:

1. Transfecting into a cell two constructs, the first construct comprising the sequence
encoding an anchor fused to a zipper sequence fused to the first half of the fluorophore,
the second construct comprising protein X fused to a zipper sequence fused to the
second half of the fluorophore.

2. Inducing translocation of X to the anchor site.

3. Monitoring the increase in fluorescence caused when the zippers bring the two halves of
the fluorophore into close apposition, and the fluorophore emits light upon excitation.

An example of such use is split fluorophore complementation as a redistribution sensor:
here one cell line that stably expresses a histone-zipper-GFPa fusion is made. Into that cell
line the nuclear translocator, i.e. p65-zipper-GFPb, is transfected. Upon translocation of p65
into the nucleus the two zipper sequences will cause GFPa and GFPb to fold and mature
and be fluorescent.

The present invention includes as anchor components for the method any and all
genetically encodable cellular components that have a defined cellular distribution.

Anchor systems can be designed to achieve redistribution to compartments or locations
within cells where the proteins of interest will experience the influences that would
normally be required to modulate the interaction between those proteins. As an example,
some proteins normally require to be phosphorylated or dephosphorylated by enzymes
sequestered in the plane of the plasma membrane - for such proteins of interest it is
appropriate to choose an anchor component that would be expected to be confined to the
plasma membrane, to allow the interacting proteins to be appropriately modified. Thus, in
one embodiment, a preferred anchor component that will target the anchor conjugate to
the plasma membrane is a protein containing the transmembrane domain of the
epidermal growth factor receptor (EGFR), or containing the transmembrane domain of a
protein from the integrin protein family, or containing the myristoylation sequence from
c-Src (residues 1-14).

In another embodiment, a histone protein is used as the anchor, or a protein normally
restricted to nucleoli, for example the p120 nucleolar protein, in order to direct the
anchoring conjugate to the nucleus.

In another embodiment, the anchor protein is chosen from those proteins normally
confined to mitochondrial outer or inner membranes, for example VDAC, Fo subunit of
ATP-ase, or NADH dehydrogenase. In another embodiment, the anchor protein is chosen
from the group of proteins normally confined to the various different regions of Golgi
bodies, for example TGN38 or ADAM 12-L. In another embodiment, the anchor protein is
chosen from the group of proteins normally confined to focal adhesion complexes, for
example P125 FAK, integrin alpha or beta, or paxillin. In another embodiment, the
anchor protein is chosen from the group of proteins normally associated with cytoskeletal
structures such as F-actin strands or microtubular bundles, for example MAP4, kinesins,
myosins or dyneins.

One aspect of the invention relates to multiplexing split fluorophore complementation
using different colours. By combining one fluorescent protein fragment with two or more
appropriate complementary fragments, it is possible to determine the extent of binding of
the first fluorescent protein fragment to either of the two other complementary fragments if
the two possible fluorescent complexes have distinct fluorescence excitation or emission
characteristics or both. Typically, the first fluorescent protein fragment will be fused to a
protein that may bind to any of three other proteins, each of them being appropriately
fused to distinct complementary fragments. For example, the first fluorescent protein
fragment can be a C-terminal fragment of enhanced GFP (EGFP, SEQ ID NO: 1)
obtained by splitting EGFP after residue 80. The three complementary fragments can be
appropriate N-terminal fragments of EGFP, of EGFP Y66W (SEQ ID NO: 2) and EGFP
Y66H (SEQ ID NO: 3), respectively. The three EGFP variants have different spectral
characteristics:



Fluorescent protein   Excitation max (nm)   Emission max (nm)

GFP                   386                   508
GFP Y66W              382                   448
GFP Y66H              458                   480



Reference: Heim, R., Prasher, D.C., and Tsien, R.Y. (1994) Wavelength mutations and
posttranslational autoxidation of green fluorescent protein. Proc. Natl. Acad. Sci. U.S.A.
91, 12501-12504.

For example, fluorescent complexes produced by assembling a C-terminal half of GFP
(e.g. residues 158 to 238) with corresponding N-terminal halves (e.g. residues 1-157) of
GFP, GFP Y66W, or GFP Y66H will have clearly distinct fluorescence characteristics and
the relative amounts of the complexes in mixtures can be calculated.
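One way such a calculation could be carried out is linear unmixing: if each complex contributes a known signal to each detection channel, the mixture signal is a weighted sum that can be solved for the weights. The Python sketch below assumes three channels and three complexes, and the reference matrix is invented for illustration; it is not derived from measured spectra.

```python
def det3(m):
    """Determinant of a 3x3 matrix given as a list of rows."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


def unmix(reference, measured):
    """Solve reference * amounts = measured by Cramer's rule and return the
    relative amounts (fractions summing to 1).

    reference[channel][complex] is the signal of one unit of a complex in a
    channel; measured[channel] is the signal of the mixture.
    """
    d = det3(reference)
    if abs(d) < 1e-12:
        raise ValueError("reference signals are not linearly independent")
    amounts = []
    for k in range(3):
        replaced = [row[:] for row in reference]
        for channel in range(3):
            replaced[channel][k] = measured[channel]
        amounts.append(det3(replaced) / d)
    total = sum(amounts)
    return [amount / total for amount in amounts]


# Hypothetical per-unit signals (rows: channels, columns: complexes).
reference = [
    [0.9, 0.1, 0.2],
    [0.1, 0.8, 0.3],
    [0.0, 0.1, 0.5],
]
measured = [2.1, 1.3, 0.6]  # made-up mixture signal
fractions = unmix(reference, measured)
print(fractions)
```

With more channels than complexes, a least-squares fit would be the natural replacement for Cramer's rule.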

Not only various colours can be used as exemplified above, but other physical parameters
of the fluorophore can be altered, e.g. intensity, fluorescence lifetime, folding time etc.

Intracellular signalling is a highly dynamic process where the signalling proteins
translocate and undergo reversible interactions with different binding partners. It is likely
that several of the cytosolic translocations occur by passive diffusion, and are controlled by
the conditional change of affinities between the specific signalling protein and its binding
partners (Teruel, M.N. & Meyer, T. Cell 104, 181 (2000)). This system is in a dynamic
equilibrium where only minor changes may lead to net movement of the signalling protein
towards another binding partner. To study these phenomena in a physiologically meaningful
way, the studies need to be performed in a cellular environment. In the present invention, we
have expressed multiple binding partners fused to half a fluorophore together with the
signalling protein fused to the other half of the fluorophore. The binding partners will be
fused to parts of the fluorophore containing different mutations that upon complementation
with the other half of the fluorophore (fused to the signalling protein) will exert different
spectral properties (e.g. excitation, emission, fluorescence lifetime a.o.). This will allow
quantification of how the signalling protein is divided between the different binding
partners. Also, the change of the equilibrium, caused e.g. by a drug or
putative drug, can be assessed.

An example of this is the MAPK signalling pathways, which rely on sequential binding events
between multiple enzymes. Taking the Erk pathway, Ras will bind to and activate MEKK1,
which will bind to and activate MEK1, which will bind to and activate Erk1. Fusing MEKK1
and Erk1 to two fluorophore halves carrying different mutations, and fusing MEK1 to the
other part of the fluorophore, will allow monitoring of the
equilibrium between MEKK1:MEK1 binding versus MEK1:Erk1 binding by spectral
analysis of the fluorescence from a cell population. Additional information on the specific
localisation of the binding events is obtained by performing the detection in a microscope
system with spectral analysis capabilities.

The two colour multiplexing has several uses. In most cases, as described above, the
protein of interest is linked to the "constant" half of the fluorophore whereas its interaction
partners each are linked to "variable" parts of the fluorophore, e.g. one that upon fusion
gives rise to a green fluorophore and one that gives rise to a blue fluorophore. This can
give spatial information, i.e. the two different interaction partners are in different locations
so colour will tell you where your protein of interest is. The interaction partners could also
be in the same location, so here colour gives you an indication of which interaction partner
your protein prefers at any given time. Finally, as a special case of the latter, if your
protein's interaction with the different partners is modulated by e.g. posttranslational
modifications, colour can tell you whether your protein is modified or not. These three
different setups can be used either as sensors for the physical state of your protein in the
broadest possible sense, or they can be used as screening assays where you measure
the ability of the test compounds to alter the ratio between the two colour readouts.
Finally, colour is only one physical parameter of GFPs. Other physical parameters that
can be localised to specific amino acids in the GFP sequence and that are easily
detectable, such as absorption spectra, fluorescence lifetime, time for fluorophore
maturation etc., could be employed in exactly the same way as colour. The number of
different interaction partners need not be limited to two.
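The screening readout described above, the ratio between two colour signals, can be sketched in Python as follows. The normalised ratio green/(green + blue), the minimum shift of 0.2 used to call a hit, and the compound names and readings are all illustrative assumptions.

```python
def colour_ratio(green, blue):
    """Share of the total signal found in the green channel."""
    total = green + blue
    if total <= 0:
        raise ValueError("no fluorescence signal")
    return green / total


def find_hits(control, wells, min_shift=0.2):
    """Return compounds whose colour ratio differs from the untreated control
    by at least `min_shift` (a hypothetical cut-off), with the shift."""
    baseline = colour_ratio(*control)
    hits = {}
    for name, (green, blue) in wells.items():
        shift = colour_ratio(green, blue) - baseline
        if abs(shift) >= min_shift:
            hits[name] = round(shift, 3)
    return hits


control = (500, 500)  # made-up green and blue readings without compound
wells = {
    "cpd_1": (520, 480),
    "cpd_2": (200, 800),
    "cpd_3": (900, 100),
}
hits = find_hits(control, wells)
print(hits)
```

A negative shift indicates movement towards the partner giving the blue complex, a positive shift movement towards the partner giving the green complex.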

One special example shows how this method may be employed as a tool to use spatial
information for screening, even if nothing is known about the interaction partners of a
particular protein in different compartments (or if it has no interaction partners in one or
more of those compartments). Cystic fibrosis is perhaps the most frequent and
well-studied protein trafficking disease. The cystic fibrosis transmembrane conductance
regulator (CFTR) is a multi-membrane spanning protein that normally functions at the
apical plasma membrane of airway epithelial cells as a Cl- efflux channel. The most
common mutation (DeltaF508) causes the protein to be retained in the endoplasmic
reticulum (ER) and so reduces the amount of CFTR expressed in the plasma membrane
of epithelial cells, resulting in decreased Cl- efflux from the cells. It would appear from
numerous studies that this ER retention defect of DeltaF508 is reversible, and reduced
temperature, some small molecules, and induction of "chaperones" allow DeltaF508 to
traffic to the plasma membrane and increase Cl- permeability. One way of screening for
compounds that modify mutant CFTR behaviour employs the split fluorophore/multiple
colours concept. You could express the mutant CFTR in cells as a fusion with a zipper
fragment fused to the "constant" half of the fluorophore. Expression in the same cell of
fusions of ER and plasma membrane markers with zippers fused to different colours will
reveal when the CFTR mutant reaches the ER, as this gives rise to one colour. If CFTR is
moved to the plasma membrane (PM), e.g. by drug stimulus, that would give rise to a
different colour. So by screening for compounds that favour the generation of the
"PM colour", you could find molecules that specifically correct the ER retention defect of
CF patients.

One aspect of the invention thus relates to a method for generating a library of interacting
proteins within living cells consisting of:

1. Introducing into a pool of cells two sets of plasmids, either simultaneously or
sequentially, one set of plasmids encoding a library of proteins A each fused to the
N-terminal half of the complementing fluorophore and the second set of plasmids encoding a
library of proteins B each fused to the C-terminal half of the complementing fluorophore.

2. Sorting the cells by imaging into those where a functional fluorophore has been
formed in the right compartment and those where a functional fluorophore has not been
formed or those where a functional fluorophore has been formed but in the wrong cellular
compartment, the formation of said functional fluorophore being indicative of an
interaction having occurred between proteins A and B within the right cellular
compartment.



For example, for a protein moving from one compartment to another compartment in
response to a stimulus, the binding partners in each compartment can be identified.
In one embodiment of the invention, the method relates to identification of drugs that will
cause disruption of binding between two proteins when located in one cellular
compartment but not in another cellular compartment. This embodiment is carried out
essentially as described above with the only difference that instead of sorting the cells
based on intensity, the cells are imaged with standard imaging equipment to determine
not only if binding has taken place, but also where such binding has occurred.
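The image-based decision, whether binding occurred and where, can be sketched in Python once per-compartment intensities have been extracted from an image. The segmentation itself is outside this sketch, and the compartment names, the threshold and the readings are illustrative assumptions.

```python
def classify_cell(signal, target, threshold):
    """Classify one imaged cell from its per-compartment fluorescence.

    `signal` maps compartment name to intensity; `target` is the compartment
    in which binding is of interest.
    """
    bright = {name for name, value in signal.items() if value >= threshold}
    if not bright:
        return "no binding detected"
    if target in bright:
        return "binding in target compartment"
    return "binding elsewhere"


cells = [
    {"nucleus": 300, "cytosol": 20},   # made-up readings
    {"nucleus": 15, "cytosol": 250},
    {"nucleus": 10, "cytosol": 12},
]
labels = [classify_cell(cell, target="nucleus", threshold=100) for cell in cells]
print(labels)
```

Comparing these labels before and after addition of a compound would show whether binding is lost in one compartment while it persists in another.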



This system is also useful for screening for fluorophores with novel properties, such as
those that can be used in the split fluorophore complementation usages described above.
By mutagenesis of both the N- and C-termini of GFP in this system, the system is used to
screen in a combinatorial manner for double (or more) mutants of the fluorophore with
novel properties. This gives a wider selection to choose from. Furthermore, as the two
mutations are in different halves of the molecule, they could be additive or compensatory,
or both. Finally, the fact that they have been found using the split fluorophore
complementation system immediately means that they can be used as sensors in this
system.

Based on CaM and M13 each fused to a half fluorophore, the amount of Ca can be
determined as fluorescence. This principle can be applied to other systems, where the
presence of one component will be reflected by the binding of two proteins (divalent metal
ions are known for doing this).
Examples



Example 1: Alignment of fluorescent proteins



Genbank entry   Fluorescent protein

P42212          Aequorea victoria green-fluorescent protein
AF372525        Renilla reniformis green fluorescent protein
AY015996        Renilla muelleri green fluorescent protein
AY013824        Aequorea macrodactyla isolate GFPxm
AF384683        Montastraea cavernosa green fluorescent protein
AF401282        Montastraea faveolata green fluorescent protein
AY015995        Ptilosarcus sp. 030-2001 green fluorescent protein
AF322221        Anemonia sulcata green fluorescent protein asFP499
AF322222        Anemonia sulcata nonfluorescent red protein asCP562
AF246709        Anemonia sulcata GFP-like chromoprotein FP595
AF168419        DsRed Discosoma sp. fluorescent protein FP583
AF168420        Discosoma striata fluorescent protein FP483
AF168421        Anemonia majano fluorescent protein FP486
AF168422        Zoanthus sp. fluorescent protein FP506
AF168423        Zoanthus sp. fluorescent protein FP538
AF168424        Clavularia sp. fluorescent protein FP484
P42212    avGFP    MSKGEBLPTGVWILVELDQDV 22
AY015996  rmGFP    MSKQILKNTCLQEVNSYKVNLBGIV 25
AF372525  rrGFP    MD---IAKIiGLKEVMPTKim.BGLV 22
AF168419  dsRed    MRSSIQIVIKBFMRPKVRKEGTV 22
AF322222  asCP562  - MASFUCKTMPPKTTIBGTV 19
AF168422  asCP562  - MftQSKHOXiTKEMTMKyKMEGGV 22
AF246709  asFP595  - MASPLKKTMPFKTTIEGTV 19
AF322221  asFP499  -- - MYPSIKETNRVQLSMEGSV 19
AF384683  mcGFP    -HSVIKPIMEIKWMQGW 18
AF401282  mfGFP    - -M5VIKPI»4KIKLRMKAV 18
AF168424  csFP484  HRCKFVFOLSFLVlAXTNANIFLRNEADLEEia^ 60
AF168420  dsFP483  MSCSKSVIKEEMLIDLHLEGTF 22
AY015995  spGFP    MNRNVLKNTGLKBIMSAKASVEGIV 25
AF168423  zsFP538  MAHSKHGIiKEEMTMKYHMEGCV 22
AF168421  amFP486  MALSNKFIGDDMKMTYHMDGCV 22
AY013824  amGFPxm  - -- - MSKGEELFTGIVPVLIELDGDV 22

: : ! ;* . 

P42212    avGFP    NGHKFSVSGEGEGDATYGKIi--TLKFICTTG-KLPVPWPTLVTTFSYGVQCFSRyPDHMK 79
AY015996  rmGFP    NNHVFTMEGCGKGNILFGNQ- -LVQIRVTKGAPLPFAFDIVSPAFQYGNRTFTKYPNDIS 83
AF372525  rrGFP    GDHAFSMEGVGEGNILEGTQ- -EVKISVTKGAPLPFAFDIVSVAFSYGNRAYTGYPEEIS 80
AF168419  dsRED    NGHEFEIEGEGEGRPYEGHN--TVKLKVTKGGPLPFAWDILSPQFQYGSKVYVKHPADIP 80
AF322222  asCP562  NGHYFKCTGKGEGNPPEGTQ--EMKIEVIEGGPLPPAFHILSTSCMYGSKTFIKYVSGIP 77
AF168422  asCP562  DGHKFVITGEGIGYPFKGKQ--AINLCVVEGGPLPFAEDILSAAFNYGNRVPTEYPQDIV 80
AF246709  asFP595  NGRYPKCTGKGEGNPFEGTQ--EMKIEVIEGGPLPFAPHILSTSC:!MYGSKTFIKYVSGIP 77
AF322221  asFP499  NYHAPKCTGKGEGKPYEGTO--SIiNITITEGGPLPPAFDILSHAFQYGIKVPAKYPKEIP 77
AF384683  mcGFP    NGHKFVIKGEGEGKPFEGTQ--TTNLTVKEGAPLPFAYD1LTSAFQYGNRVFTKYPDDIP 76
AF401282  mfGFP    NGHKFVIEGDGKGKPFEGTQ- -SMDLTVKEGAPLPPAYDILTTVFDYGNRVFAKYPOMP 76
AF168424  csFP484  NGHAFVIEGEGEGKPYDGTH--TI<in:iEVKEGAPLPFSYDILSNAFQYGNRALTKYPDDIA 118
AF168420  dsFP483  NGHYF8IK3XOK<3QPNEGTN--TVTLEVTKGGPLPFGWHILCPQFQYGNK^^ 80
AY015995  spGFP    MNHVFSMEGFGKGNVLFGNQ--I;NQIRVTRGGPLPFAFDIVSIAFQYGN^ 83
AF168423  zsFP538  NGHKFVITGEGZaYPFKGKQ--TINZiCVIEGGPLPFSEDIIiSAGFKYGDRIFTEYPQD^ 80
AF168421  amFP486  NGHYFTVXGEGOrGKPYBGTQTSTFKVTMANGGPLAFSFDILSTVFKYGm 82
AY013824  amGFPXM  HGHKFSVRGEGE(^:)I^YGKL--EIKFICTTG-KLPVPWPTLVTT^ 79

35 * * * * « * , , * •* , , • ** t s • 
P42212    avGFP    QHDFPKSAMPE(mrQERTIFFKDI>GKnriCTRAEV- -KFEO DTLVHRIBLKGIDFKBDG 134
AY015996  rmGFP    --DyFlOSFPAGFMyERTLRYEDGGLVBIRSDI--llljIE DKFVYRVBYKOSNFPDDG 136
AF372525  rrGFP    --DYFLOSFPEaFTyERNIRYQDGGTAIVKSDI--SLED GKPXVNVDFKAKDliRRMG 133
AF168419  dsRED    .-DYKKIiSFPEGFKWERVMNFEDGGVVTVTQDS--SIiQD GCFIYKVKFIGVNFPSDG 133
AF322222  asCP562  --DYFKQSFPBGFTWERTTTYEDGGFLTAHQDT--SLDG DCLVYKVKILGNNFPAIX5 130
AF168422  asCP562  --DYFKNSCPAGYTWDRSFLFEDGAVCICNADITVSVEEN CMYHESKFYGVNFPADG 135
AF246709  asFP595  --DYFKQSFPEGFTWERTTTYEDGGFI*TAHQDT--SLDG DCLVYKVKILGNNFPADG 130
AF322221  asFP499  --DFFKQSI1PGGFSWERVSTYEDGGVLSATQET--SLQG---DCIICKVKVLGTNFPANG 130
AF384683  mcGFP    - -DYFKQTFPEGYSWERIMAYEDQSICTATSDI - - KMEG- -- DCFI YEIQFHGVNFPPNG 129
AF401282  mfGFP    --DYFKQTFPEQYSWERSMTYEDQGICVATNDI--TIjMKGVDDCFVYKIRFDGVOTP/^ 132
AF168424  csFP484  --DYFKQSFPEGYSWERT^f^FEDKGIVKV^KBDI--SMEE---DSFIYEIRFDGMNFPPNG 171
AF168420  dsFP483  --DYLKLSFPEGYTWERSMHFEDGGLCCITNDI--SLTG NCFYYDIKFTGLNFPPNG 133
AY015995  spGFP    - -DYFVQSFPAGFFYERNLRFBDGAIVDIRSDI - - SliED DKFHYKVEYRGNGFPSNG 136
AF168423  zsFP538  --DYFKNSCPAGYTWGRSFLFEDGAVCICNVDITVSVKEN CI YHKSI FNGMNFPADG 135
AF168421  amFP486  --DYFKQAPPD<»ISYERTFTYEDGGVATASWEI--SLKGN CFEHKSTFHGVNFPADG 135
AY013824  amGFPXM  MNDFFKSAMPEGYIQERTIFFQDDGKyKTRGBV--KFEG DTLVHRIELKGMDPKEDG 134

*- :** * i;*. : 5 ..f ♦

P42212    avGFP    NILGHKLEYNYNSHNVYIMADKQKMGIKVNFKIRHNIEDGSVQLADHYQQ^ 192
AY015996  rmGFP    pVM-QKTILGIEPSFBAMYMN--NGVLVGBVIliVyKIiNSGI^ 190
AF372525  rrGFP    pVM-QQDIVGMQPSyESMYTK--VTSVIGBCIIAFKX<QTGKHFTYHMRTV YKSKKPV 187
AF168419  dsRED    PVM-QKKTMGWBASTBRLYPR--DGVLKGErHKRLKLiaDGGHYLVE^^ YMAKKP- 186
AF322222  asCP562  P RDAEQS- 137
AF168422  asCP562  PVM-KK^m>NWBPSCEKIIPVPKQGILKCH)VSMYLLllKDGQRI^ YKAKSVP 191
AF246709  asFP595  PVM-QOTOlGRWEPATEIVYEV--DGWiRGQSIiMAI«CPGGRHI*TC^^ 186
AF322221  asFP499  PVM-QKKTa5WEPSTETVIPR--DGGLLLRDTPAI)MLADGGHLSCFM^ YKSKKE- 183
AF384683  mcGFP    PVM-QKKTLKWEPSTEKMYVR- -TCVLKGDVNMALLLEGGGHYRCDFRST YKAKKR- 182
AF401282  mfGFP    PVM-QKKTLKWEPSTEKMYVR- -DGVLKGDVNMALLLEGGGHYRCDPKTT YKAKKF- 185
AF168424  csFP484  PVM-QKKniKWEPSTEIMYVR- -DGVLVQDISIESLLIiEGGOTYRCDPKSI YKAKKV- 224
AF168420  dsFP483  PW-QKKTTGWEPSTERLYPR--DGVLIGDIHHAI»TVEGGGHYACDlRrV--rYIUaC^^ 187
AY015995  spGFP    PVM-Qia^IIlGMEPSPEVVY^IN--SGVLVGEVDLVYKLESGNYYSC3M^ YRSKGGV 190
AF168423  zsFP538  PVM-KKOTTmJEASCEKIMPVPKQGILKGDVSMYLLLKDGGRYRCQFDTV YKAKSVP 191
AF168421  amFP486  PVM-AKKTTGWDPSFEKMTVC--DGILKQDVTAFLMLQGGGNYRCQFHTS YKTK-KP 188
AY013824  amGFPXM  NXLGHKLEYNFNSHNVYIfn?DKAIWGIiKVNFKIRHNIEGGGVQrJ^OT^ 192



P42212    avGFP    VLLPDNHYLSTQSALSKDPNEKRDHMVLLEFVTAAGITHCaiDELyK 238
AY015996  rmGFP    KEFPSYHFIQHRLEKTYV-EDGG-FVEQHETAIAQMTSIGKPLGSLHEWV 238
AF372525  rrGFP    ETMPLYHFIQHRLVKTNV-DTASGYWQHETAIAAHSTIKKIEGSLP 233
AF168419  dsRED    VQLPGYYYVDSKLDITSH-NEDYTIVEQYERTEGRHHLFL 225
AF322222  asCP562  RKMG- ASHRDTL 148
AF168422  asCP562  RKMPDWHFIQHKLTREDRSDAKNQKWHLTEHAIASGSALP 231
AF246709  asFP595  LKMPGFHFEDHRIBIMBE-VEKGKCYKQYEAAVGRYCDAAPSKliGHNr--- 232
AF322221  asFP499  VKLPELHFHHIiRMEKLNI-SDDWKTVBQHESWASYS-QVPSKIiGHN 228
AF384683  mcGFP    VQLPDYHFVDHRIEILSH-DNDYNTVKLSEDAEARYSMLPSQAK 225
AF401282  mfGFP    VQLPDYHFVDHRIEILSH-DKDYNKVKLYEHAEA-HSGLPRQAK 227
AF168424  csFP484  VKLPDYHFVDHRIEILNH-DKDYNKVTLYENAVARYSLLPSQA 26^
AF168420  dsFP483  I^MPGYHYVDTKLVIWNN-DKEFMKVEEHEIAV/aOTIPFYEPKKDK 232
AY015995  spGFP    KEFPBYHFIHHRIiEKTYV-EEGS'FVEQHETAIAQLTTIGKPLGSMEWV 238
AF168423  zsFP538  SKMPBWHPIQHKLLREDRSDAKNQKWQLTEHAIAFPSAIiA 231
AF168421  amFP486  VTMPPNHWEHRIARTDLDKGGMS-VQLTBHAVAHITSWPF 229
AY013824  amGFPXM  VLIPIlWyi.STQTAISKDRHETRDHMVFLEPPSACaimiGMDEIjYK 238

55 I 
SEQ ID 1 Amino acid sequence of GFP

MSKGEELFTGVVPILVELDGDVNGHKFSVSGEGEGDATYGKLTLKFICTT
GKLPVPWPTLVTTFSYGVQCFSRYPDHMKQHDFFKSAMPEGYVQERTIFF
KDDGNYKTRAEVKFEGDTLVNRIELKGIDFKEDGNILGHKLEYNYNSHNV
YIMADKQKNGIKVNFKIRHNIEDGSVQLADHYQQNTPIGDGPVLLPDNHY
LSTQSALSKDPNEKRDHMVLLEFVTAAGITHGMDELYK



SEQ ID 2 Amino acid sequence of GFP Y66W

MSKGEELFTGVVPILVELDGDVNGHKFSVSGEGEGDATYGKLTLKFICTT
GKLPVPWPTLVTTFSWGVQCFSRYPDHMKQHDFFKSAMPEGYVQERTIFF
KDDGNYKTRAEVKFEGDTLVNRIELKGIDFKEDGNILGHKLEYNYNSHNV
YIMADKQKNGIKVNFKIRHNIEDGSVQLADHYQQNTPIGDGPVLLPDNHY
LSTQSALSKDPNEKRDHMVLLEFVTAAGITHGMDELYK

SEQ ID 3 Amino acid sequence of GFP Y66H

MSKGEELFTGVVPILVELDGDVNGHKFSVSGEGEGDATYGKLTLKFICTT
GKLPVPWPTLVTTFSHGVQCFSRYPDHMKQHDFFKSAMPEGYVQERTIFF
KDDGNYKTRAEVKFEGDTLVNRIELKGIDFKEDGNILGHKLEYNYNSHNV
YIMADKQKNGIKVNFKIRHNIEDGSVQLADHYQQNTPIGDGPVLLPDNHY
LSTQSALSKDPNEKRDHMVLLEFVTAAGITHGMDELYK
